Hellman-Feynman operator sampling in Diffusion Monte Carlo calculations 
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Diffusion Monte Carlo (DMC) calculations typically yield highly accurate results in solid-state 
and quantum-chemical calculations. However, operators that do not commute with the Hamiltonian 
are at best sampled correctly up to second order in the error of the underlying trial wavefunction, 
once simple corrections have been applied. This error is of the same order as that for the energy in 
variational calculations. Operators that suffer from these problems include potential energies and 
the density. This paper presents a new method, based on the Hellman-Feynman theorem, for the 
correct DMC sampling of all operators diagonal in real space. Our method is easy to implement in 
any standard DMC code. 



Diffusion Monte Carlo (DMC) is widely used for the 
computation of properties of solids and molecules 
Frequently, it is used as a check on other methods Q or 
even as an input It is therefore very important that 
DMC be as accurate as possible. However, other than 
for the total energy, standard DMC calculations are not 
as definitive as one would hope, since operators that do 
not commute with the Hamiltonian cannot be sampled 
exactly within standard DMC. Here we present a simple 
yet effective addition to standard DMC that plugs that 
gap and is easy to implement. 

DMC by construction yields the normalized expecta- 



tion value {0) DM c = {^T\0\% n )/(^ T \% n }, which is 
generally not the true ground-state expectation value 
(6) = (*o|0|* )/(*o|*o)- In fact, it is not even 
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the ground-state ex- 



pectation value constrained by a nodal structure of the 
fermionic many-body wavefunction that is given by ^t- 
is a trial wavefunction that approximates the gener- 
ally unknown ground-state wavefunction Wo an d is real. 
In its basic and most common form is the ground 
state for a fixed nodal structure given by that of Wp. 
In addition to this fixed-node approximation operators 
that do not commute with the Hamiltonian are gen- 
erally subject to a further error, the leading term of 
which is linear in the difference between an d Wq. 
In conjunction with Variational Monte Carlo (VMC), 
this error can be reduced by one order Q by using 
(O)cDMC = 2(6) dmc - (O)vmc- Correct sampling can 
be achieved, e.g. by using forward walking [g], repta- 
tion Monte Carlo [fj, and other methods [7|. Many of 
these methods aim to sample ^q™^^", rather than the 
usual DMC distribution W^Wq. They are therefore not 
straight forward additions to the DMC algorithm. Al- 
ternatively, the Virial theorem or the related Hellman- 
Feynman (HF) theorem [al can be used to evaluate op- 
erator expectation values [9|] which in the case of DMC, 
however, involves numerical derivatives of noisy data. 



In this Letter, we present a method based on the ap- 
plication of the HF theorem to the DMC algorithm di- 
rectly. Our method - Hellman-Feynman sampling (HFS) 
- can be tagged onto the usual sampling of operators with 
nearly no extra computational overhead. The aim is to 
maintain the basic DMC algorithm that samples WtWq™. 
Now, the total energy is evaluated correctly within stan- 
dard DMC, and crucially operator expectation values can 
be cast as HF derivatives of the total energy. Keeping in 
mind that ultimately the DMC algorithm is nothing but 
a large sum that yields the total energy, we see the HF 
derivative can be applied without problem to the algo- 
rithm itself! One advantage over numerical derivatives is 
that the resulting formula can handle several operators 
simultaneously in a single DMC run and maintaining or- 
bital occupancy for perturbed Hamiltonians ceases to be 
a problem. The DMC algorithm only involves numbers, 
so non-commutability of operators - the source of the dif- 
ficulties - is no issue either. Writing down the DMC al- 
gorithm as a mathematical formula and applying the HF 
derivative to it yields an object that when sampled using 
standard DMC produces the exact operator expectation 
value. It has to by construction. 

In the following, we present a schematic overview of 
the DMC algorithm, which however is sufficient to de- 
rive the relevant formulas. The basic idea of DMC is 
to split the imaginary-time propagator exp(— AtH) w 
exp(— AtT) exp(— AtV) for sufficiently small time inter- 
vals At into a kinetic and potential term and then to 
iterate it. This ultimately [lfj| gives rise to a real-space 
drift-diffusion process sampled using Monte Carlo (MC) , 
augmented by an exponential growth term whereby N w 
so-called walkers are propagated in parallel. Courtesy 
of this growth term, at each propagation or (imaginary) 
time step i the walker j acquires a multiplicative weight: 



- A *K,-4?) wh ere Eh = 



H^t/^t evaluated at the 



real-space position of walker j at time step i and E® is 
an estimate for the ground-state energy also at time step 
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i. The total weight of walker j at time step % becomes 

= IT e- At «i- B «°), wherei?, = i^lj' (1) 
fc=i * j=i 

and the presence of £^ ensures normalization. At time 
step i the estimator for an operator that a DMC code 
yields is 



of 



(2) 



where Ofj = O^t/^t and the wavefunction is eval- 
uated for walker j at time step i. For brevity, we use this 
bar-average Of where applicable and note that Of has 
to be averaged over all i to yield the final DMC estimate 
(O)dmc- Since the ground-state energy is not known, 
an estimate chosen such that Eq. ([1]) remains normal- 
ized has to be used. This is the growth estimator Ef [ll| 
and is updated at each step, hence the index i. Note 
that Ef is independent of j, i.e. it is the same for ev- 
ery walker and thus a property of the DMC process as a 
whole. For reasons of numerical stability, DMC is imple- 
mented by allowing walkers to die or multiply such that 
the walker's survival probability optionally augmented by 
residual weights corresponds to Eq. JT]). 

Given a perturbed Hamiltonian H{a) — H + aO and 
the associated fixed- node ground state energy Ef n (a) = 
{H)dmc-, first-order perturbation theory for 'Jq™ yields 
a fixed node equivalent of the HF theorem 



(O)fn 



dE fn (a) 



(3) 



a=0 



where (O) /„ converges to the correct ground-state ex- 
pectation value as the nodes of become exact though 
*S?t itself need not. Note that while (H)dmc — {H)fn 
we have (6) DM c + (0)f n , unless [6,H] = 0, so Eq. © 
is not trivial. The energy Ef n (a) is accessible exactly 
within standard DMC as the Hamiltonian H(a) com- 
mutes with itself. Analytic operator estimators can then 
be derived by applying the HF theorem to the formula 
expressing the DMC algorithm Eq. @. Using Eqs. ([T]) 
and ^ the expectation value at time step i becomes 

N w i 

Ei (a) = J2 E U a ) IT e- At K, W- B °W). (4) 
j k=i 

Here, Efc (a) = Efc + aOfa and Ef{a) = Ef + AEf(a), 
so the weight of the wavefunction is 



N w / i N 



exp(-At£(aO^-AI?°(a))J (5) 

(6) 



k=l 



exp (-taXi) exp (tAEf(a)) 



where Xij = \ J2k=i Of ^ an d * = anc ^ we 

have made use of the fact that the growth estimator 
AEf(a) is independent of the index j. AEf(a) ensures 
that f2j(a) = 1 to all orders of a, hence AEf(a) = 



j log exp (— taXi) . Eq. ([4]) then becomes 



Ei(a) 



Ef(a)e- taX * 

„-taXi 



(7) 



Evaluating AEf to first order gives the growth estimator 
of an operator: 



O 



GR 



dEf(c 



da 



dAEf(a) 



da 



= X t . (8) 



In other words, the DMC sampling of Xij by virtue of 
the HF theorem yields a growth estimator of the true 
expectation value of O. Interestingly, the growth estima- 
tor, if the residual weights are chosen to be zero, appears 
to be similar to Eq. (13) of Ref. 7]. Applying the HF 
theorem to the energy estimator Eq. |7]) yields a second 
estimator 



Of 



dE t {a) 



da 



Of -t EfXi - Ef ■ X, 



a=0 



(9) 



Equations ([5]) and © are of course evaluated at a — 
and are therefore accessible in a regular DMC cal- 
culation. We see that for Of the standard estima- 
tor Of is augmented by a correction term AOf = 
—t [EfXi — Ef ■ Xij . Several observations can be 
made. First, in the case of the being the ground 
state vPq™ for a given nodal structure the correction term 
is zero (Efj is a constant!) and only Of contributes as 
it should. Furthermore, the new estimator Of and the 
usual one Of sample an observable and are both inde- 
pendent of the auxiliary DMC parameter t. It follows 
that EfXi -Ef-X~i~ \. Thirdly, since the growth 
estimator Eq. {5J is derived from the "averaged" quan- 
tity Ef rather than Ef, Eq. © is itself already averaged 
over i and therefore the final estimate at i. This is in 
contrast to Eq. ([9]) which still has to be averaged over 
all i to yield the final DMC estimate. Using Ef yields an 
estimator Of R which when averaged over i gives Of R . 
Finally, within the fixed-node approximation the correc- 
tion term in Eq. ([9]) can be viewed as a direct measure 
of the error of the trial wavefunction with respect to a 
certain operator. In the remainder of the paper we will 
only discuss the direct estimator Eq. ([9|). 

An important question is which operators are admissi- 
ble and can be sampled using the HF estimator Eq. © 
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or for that matter the growth estimator Eq. {SJ? Look- 
ing at the definition of the DMC algorithm one sees that 
it is based on splitting the Hamiltonian into a kinetic 
energy kernel that gives rise to the diffusion part of the 
algorithm and a potential energy term that has to be 
diagonal in real space. The diffusion part always being 
the same it follows that AH = aO has to be diagonal 
in real space too. Using for example O l = T L = 
therefore actually corresponds to sampling the real space 
many-body operator given by the function T L , rather 
than the kinetic energy. The result using Eq. ^ is 



I 



fn 



dV which in general is not the desired ex- 



fn 



dV . Neverthe- 



pectation value (T) f n = J f n 

less, (T) /„ is accessible within DMC by using (T) f n 

1 1 j n — (V) f n since the last two quantities can be sam- 
pled using standard DMC and HFS, respectively. 

In the following, we give a few examples to demonstrate 
the applicability of HFS. We apply the method to sam- 
ple (i) the density of Helium and (ii) the Ewald energy 
of a homogeneous electron gas with and without inter- 
actions. All data are given in atomic units and we used 
the CASINO [l3| package. The target for the number of 
walkers was between 200 and 400 and the residual weights 
were allowed to fluctuate between 0.5 and 2. While we 
did not perform extensive studies it seems the algorithm 
works with and without residual weights. The only mod- 
ification to the code consisted of adding a variable X to 
each walker, updating X and applying Eq. Other 
than that we used the code as-is in a standard setup. 

Figure Q] shows the electron density (arbitrary units) 
of He, as obtained from standard DMC and from our HF 
method. When the well-converged (i.e. AOf <C Of) 
correlated wavefunction supplied with CASINO is used 
both calculations yield essentially the same result (solid 
line); when an "incorrect" trial wavefunction (which we 
have chosen to be the same as the "correct" one but 
with the radial term heavily skewed) is used, only our 
new method (dotted line) recovers the correct density, 
albeit the noise in the data is larger. Equally, the interac- 
tion energy is also recovered (DMC correct wavefunction: 
0.947, incorrect wavefunction 0.791, incorrect wavefunc- 
tion HFS: 0.958). We have also performed DMC calcu- 
lations of the Hydrogen density, where we systematically 
deformed the known exact wavefunction. Suffice it to say, 
as for He we again see confirmation of our algorithm. An 
interesting point to add here regards the extent to which 
the wavefunction could be skewed. It turns out - rather 
plausibly - that if the wavfunction ceases to actually sam- 
ple certain parts of phase space HFS cannot recover the 
true form. Nevertheless it seems capable of correcting rel- 
atively strong errors in the wavefunction (viz. the rarely 
sampled asymptoticically decaying part of the wavefunc- 
tion in Fig. (TT])), but the details are clearly a topic for 
further investigation. 




FIG. 1: The "exact" Helium density (solid line) was de- 
rived using the well optimized wavefunction provided by the 
CASINO package: The difference (not shown) between stan- 
dard DMC sampling and HFS is essentially zero (AOf <C 
Of). In addition, results using a trial wavefunction with 
wrong radial function are presented. Standard DMC yields 
a smooth but rather poor density. HFS, while noisier (see 
inset), follows the correct density even in the asymptotic re- 
gion far from the nucleus where despite little information HFS 
corrects for the wrong behaviour. 



As in standard DMC sampling, the worse the trial 
wavefucntion vp^, the larger the noise when using HFS. 
However, when looking at the raw data before averaging 
over i (not shown) we observed that the noise in the HFS 
data rises during the progression of the sampling, hence 
standard error estimation does not work. The source can 
be traced to sampling over histories Xi. Limiting their 
depth results in a constant noise term though also reintro- 
duces a systematic bias. Also, in a recent paper [14j War- 
ren and Hinde observe that using the forward-walking 
method in DMC necessitates a rapidly growing number 
of walkers as the dimensionality of the quantum system is 
increased. These two issues then lead us to the question 
as to whether HFS works for larger systems. We have 
therefore looked at an unpolarized homogeneous electron 
gas at r s = 1. We used a finite simulation cell (periodic 
boundary condition) with 54 electrons. The data we plot 
shows the Ewald interaction energy with no additional 
finite size corrections. We show in Fig. [J] results for a 
fully interacting system that we have obtained by using 
trial wavefunctions with either no Jastrow factor, a par- 
tially optimized Jatrow factor, or a fully optimized one. 
We show the mixed DMC estimate (0)dmc, the cor- 
rected estimate {0) cDM c,= 2(0} daw - (O)vmc which 
contains a second-order error, and the results for HFS. 
The MC runs start at with a short equilibration phase 
and we start sampling at time step 2000. The corrected 
estimate using the fully optimized Jastrow factor ought 
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FIG. 2: Results for the Ewald energy of an unpolarized ho- 
mogeneous electron gas (r a = 1) with 54 electrons. Standard 
DMC, (O)dmc, yields the relatively smooth curves at the 
top (see arrows). The noisier data below use HFS (see arros) 
and the thin streight lines correspond to (0} c dmc at the end 
of the run. The partial Jastrow factor contains a correlation 
term without cusp. 

to give the best result. Clearly all three HFS estimates 
are very close but especially the non-optimized wave- 
functions yield quite noisy data. Nevertheless, even in 
that case the results are a lot better than using the pure 
DMC output for the best wavefunction. However, they 
are all also better than the corrected (0) c dmc results 
of the partially/non-optimized wavefunction. Regarding 
the noise one also has to keep in mind the difficulty of 
the task: The interaction energy is dominated by the re- 
gion where the electrons get close to each other but that 
is where the error of the non-optimized wavefunctions is 
largest. HFS essentially has to build a cusp from scratch. 

Fig. [3] repeats the same analysis for a non-interaction 
Hamiltonian where the Slater determinant (no Jastrow 
factor) is the exact solution, whence the HFS data and 
the standard DMC data in that case being identical. This 
is of course consistent with Eq. ^ and proves that given 
the correct nodes, HFS yields the correct answer. Apart 
from that Fig. [3] is essentially a mirror image of Fig. [2] 
In general, we see that unless the wavefunction is well 
optimized the HFS estimate is considerably better, de- 
spite the noise in the data. Such situations might occur 
when the system is dominated by the bulk while we are 
interested in sampling data in the surface region. Op- 
timization based on the total energy or variance would 
result in a sub-optimal wavefunction away from the bulk 
and hence erroneous standard sampling. 

In conclusion, by applying the HF theorem directly to 
the DMC algorithm we have introduced a new method to 
sample a large class of operators exactly within standard 
DMC. Our method works for both small and large sys- 



FIG. 3: As in Fig. [2] but for an interaction-free Hamiltonian. 
The noisier data at the top have been sampled using the HFS 
estimator (see arrows). Below follow the relatively smooth 
standard DMC values (see arrows) sampling (O)dmc, except 
for the no-Jastrow case where the two estimators yield the 
same data as AOf = 0. The thin streight lines correspond 
to {0) c dmc at the end of the run, except in the case of no 
Jastrow factor where the thin line gives the essentially exact 
VMC value. 



terns and is easy to add to standard DMC, enabling the 
sampling of a large class of operators (densities, interac- 
tion energies, etc.): Only one extra variable per operator 
need be added to the walkers, involving no more 
than an extra summation step during sampling; simple 
algebra (Eqs. (jHJ) and ©) does the rest. Future work 
is needed to better understand, estimate, and deal with 
the noise and its slow increase with simulation time. This 
is currently under investigation. Similarly, the effect of 
residual weights needs to be looked at in more detail. A 
promising line of research already under way is to look 
at the second derivative. This might allow efficient DMC 
sampling of the fixed-node density-response function and 
related quantities, the study of which is currently not fea- 
sible due to being numerically too demanding. 
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